Abstract

A cyclic urn is an urn model for balls of types $0,\dots,m-1$ where in each draw the
ball drawn, say of type $j$, is returned to the urn together with a new ball of type $j+1$
mod $m$. The case $m = 2$ is the well-known Friedman urn. The composition vector, i.e.,
the vector of the numbers of balls of each type after $n$ steps, is, after normalization, known
to be asymptotically normal for $2 \le m \le 6$. For $m \ge 7$ the normalized composition
vector does not converge. However, there is an almost sure approximation by a periodic
random vector. In this paper the asymptotic fluctuations around this periodic random
vector are identified. We show that these fluctuations are asymptotically normal for all
$m \ge 7$. However, they are of maximal dimension $m-1$ only when 6 does not divide $m$.
For $m$ being a multiple of 6 the fluctuations are supported by a two-dimensional subspace.

MSC2010: 60F05, 60F15, 60C05, 60J10. 
1 Introduction, phenomena and results

The aim of this extended abstract is to uncover the nature of fluctuations around almost surely
oscillating sequences of random variables as they arise in a number of random combinatorial
structures, most commonly in random trees. We develop an analysis for the composition
vector of cyclic urns and use this example to describe the new phenomena and characteristics of
the fine fluctuations around a random oscillating sequence which (in an almost sure sense)
approximates the normalized composition vector of a cyclic urn.

A cyclic urn is an urn model with a fixed number $m \ge 2$ of possible colours of balls which
we call types $0,\dots,m-1$. Initially, there is one ball of an arbitrary type. In each step we
draw a ball from the urn, uniformly from within the balls in the urn and independently of
the history of the urn process. If its type is $j \in \{0,\dots,m-1\}$ it is placed back into the urn
together with a new ball of type $j+1$ mod $m$. We denote by $R_n = (R_{n,0},\dots,R_{n,m-1})^t$ the
(column) vector of the numbers of balls of each type after $n$ steps when starting with one ball
of type 0. Hence, we have $R_0 = e_0$ where $e_j$ denotes the $j$-th unit vector in $\mathbb{R}^m$, indexing the
unit vectors by $0,\dots,m-1$. For fixed $m \ge 2$ we denote the primitive $m$-th root of unity
by $\omega := \exp(2\pi i/m)$. Furthermore we set

$$\lambda_k := \Re(\omega^k) = \cos\left(\frac{2\pi k}{m}\right), \quad \mu_k := \Im(\omega^k) = \sin\left(\frac{2\pi k}{m}\right), \quad v_k := \frac{1}{m}\left(1, \omega^{-k}, \omega^{-2k}, \dots, \omega^{-(m-1)k}\right)^t, \qquad 0 \le k \le m-1. \tag{1}$$

Note that $v_0 = \frac{1}{m}\mathbf{1} := \frac{1}{m}(1,1,\dots,1)^t \in \mathbb{R}^m$.
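The drawing rule described above can be simulated directly. The following is a minimal sketch in Python using only the standard library; the function name `simulate_cyclic_urn` and its parameters are illustrative choices and are not taken from the paper.

```python
import random


def simulate_cyclic_urn(m, n, seed=None):
    """Return the composition vector R_n of a cyclic urn with m types.

    The urn starts with one ball of type 0. In each step a ball is drawn
    uniformly at random, returned, and a new ball of type (j + 1) mod m
    is added, where j is the type of the ball drawn.
    """
    rng = random.Random(seed)
    counts = [0] * m
    counts[0] = 1  # initial configuration R_0 = e_0
    for step in range(n):
        # Before this step the urn contains step + 1 balls.
        target = rng.randrange(step + 1)
        j = 0
        while target >= counts[j]:
            target -= counts[j]
            j += 1
        # The ball of type j goes back; one ball of the next type is added.
        counts[(j + 1) % m] += 1
    return counts
```

For example, `simulate_cyclic_urn(7, 10_000, seed=1)` returns a list of seven counts summing to 10001, since each step adds exactly one ball.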

The asymptotic distributional behavior of the sequence $(R_n)_{n \ge 0}$ has been identified in
Janson [ZlEli, see also Pouyanne [la IS]. Janson also developed a limit theory for the
compositions of rather general urn schemes. For simplicity of presentation we state the
case when starting with one ball of type 0. However, when starting with one ball of type
$j \in \{0,\dots,m-1\}$, the corresponding composition vector $R_n^{[j]}$ is obtained in distribution by
the relation

$$R_n^{[j]} \stackrel{d}{=} \left(\mathcal{R}^t\right)^j R_n, \qquad 0 \le j \le m-1, \tag{2}$$

where the replacement matrix $\mathcal{R}$ is defined in (4). Hence, it is sufficient to consider the cyclic
urn process started with one ball of colour 0. An extension to initially having more than one
ball is straightforward, see the discussion in [10, p. 1165].

For the cyclic urns Janson showed that for $2 \le m \le 6$ the normalized composition vector
$R_n$ converges in distribution towards a multivariate normal distribution, whereas for $m \ge 7$
there is no convergence of a conventionally standardized version of the $R_n$ due to subtle
periodicities. For $m \ge 7$ there exists a complex valued random variable $\Xi_1$ (depending on $m$)
such that almost surely, as $n \to \infty$, we have

$$\frac{R_n - \frac{n}{m}\mathbf{1}}{n^{\lambda_1}} - 2\Re\left(n^{i\mu_1}\Xi_1 v_1\right) \to 0. \tag{3}$$

We now focus on the periodic case $m \ge 7$. According to (3) the normalization
$n^{-\lambda_1}(R_n - \frac{n}{m}\mathbf{1})$ does not converge but is (strongly) approximated by the oscillating random sequence
$(2\Re(n^{i\mu_1}\Xi_1 v_1))_{n \ge 0}$. In the present paper we clarify whether it is still possible that the
fluctuations of the $n^{-\lambda_1}(R_n - \frac{n}{m}\mathbf{1})$ around the periodic sequence $(2\Re(n^{i\mu_1}\Xi_1 v_1))_{n \ge 0}$ do converge
although the sequence itself does not converge. Subsequently, we will call the differences in
(3) residuals.

Our main results stated in Theorems 1.1 and 1.2 show that the nature of the asymptotic
behavior of the residuals in (3) depends on the number of colours $m$. For $m \in \{7,8,9,10,11\}$
there is a direct normalization which implies a multivariate central limit law (CLT) for the
residuals. The case $m = 12$ also allows a multivariate CLT with a different scaling. For
$m > 12$ the residuals cannot directly be normalized to obtain convergence. However, considering
refined residuals allows a multivariate CLT for all $m > 12$. This in fact gives a more
refined expansion of the $R_n$, cf. Theorems 1.1 and 1.2. There is a further subtlety in the
nature of the fluctuations of the residuals: If 6 divides $m$ the fluctuations of the residuals are
asymptotically supported by a two-dimensional plane, i.e., the covariance matrix of the limit
normal distribution has rank 2, whereas for all $m \ge 7$ which are not divisible by 6 this support
is a hyperplane (rank $m-1$).
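The case distinction above is governed by how the values $\lambda_k = \cos(2\pi k/m)$ compare with $1/2$. The following Python sketch lists, for a given $m$, the indices with $\lambda_k > 1/2$, $\lambda_k = 1/2$ and $\lambda_k < 1/2$. It uses the elementary fact that, for $1 \le k \le m-1$, $\cos(2\pi k/m) > 1/2$ holds exactly when $k/m < 1/6$ or $k/m > 5/6$, so that only integer comparisons are needed. The function name `classify_projections` is a hypothetical choice made for this illustration.

```python
def classify_projections(m):
    """Split the indices 1..m-1 according to lambda_k = cos(2*pi*k/m).

    Returns (r, large, critical, small) where r = floor((m - 1) / 6),
    large holds the k with lambda_k > 1/2, critical those with
    lambda_k = 1/2 and small those with lambda_k < 1/2.
    """
    large, critical, small = [], [], []
    for k in range(1, m):
        # cos(2*pi*k/m) > 1/2  iff  6k < m or 6k > 5m (exact integer test).
        if 6 * k < m or 6 * k > 5 * m:
            large.append(k)
        elif 6 * k == m or 6 * k == 5 * m:
            critical.append(k)
        else:
            small.append(k)
    r = (m - 1) // 6
    return r, large, critical, small
```

For $m = 7$ this gives $r = 1$ with large indices 1 and 6, while for $m = 12$ the indices 2 and 10 are the ones with $\lambda_k = 1/2$. In general there are $2r$ large indices, and indices with $\lambda_k = 1/2$ exist only when 6 divides $m$.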

By $\stackrel{d}{\longrightarrow}$ (and $\stackrel{d}{=}$) we denote convergence (resp. equality) in distribution, and for a symmetric
positive semi-definite matrix $M$ by $\mathcal{N}(0,M)$ the centered normal distribution with covariance
matrix $M$. For $v \in \mathbb{C}^m$ we denote by $v^*$ the conjugate transpose of $v$. Furthermore, $6 \mid m$
and $6 \nmid m$ is short for 6 divides (resp. does not divide) $m$.

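We distinguish the cases $6 \mid m$ and $6 \nmid m$ as follows:

Theorem 1.1. Let $m \ge 7$ with $6 \nmid m$ and set $r := \lfloor (m-1)/6 \rfloor$. Then, there exist complex
valued random variables $\Xi_1,\dots,\Xi_r$ such that, as $n \to \infty$, we have

$$n^{\lambda_1 - 1/2}\left(\frac{R_n - \mathbb{E}[R_n]}{n^{\lambda_1}} - \sum_{k=1}^{r} 2\Re\left(n^{i\mu_k}\Xi_k v_k\right) n^{\lambda_k - \lambda_1}\right) \stackrel{d}{\longrightarrow} \mathcal{N}\left(0, \Sigma^{(m)}\right).$$

The covariance matrix $\Sigma^{(m)}$ has rank $m-1$ and is given by

$$\Sigma^{(m)} = \sum_{k=1}^{m-1} \frac{1}{|2\lambda_k - 1|}\, v_k v_k^*.$$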

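When $6 \mid m$ the normalization requires an additional $\sqrt{\log n}$ factor and the rank of the
covariance matrix is reduced to 2:

Theorem 1.2. Let $m \ge 7$ with $6 \mid m$ and set $r := \lfloor (m-1)/6 \rfloor$. Then, there exist complex
valued random variables $\Xi_1,\dots,\Xi_r$ such that, as $n \to \infty$, we have

$$\frac{n^{\lambda_1 - 1/2}}{\sqrt{\log(n)}}\left(\frac{R_n - \mathbb{E}[R_n]}{n^{\lambda_1}} - \sum_{k=1}^{r} 2\Re\left(n^{i\mu_k}\Xi_k v_k\right) n^{\lambda_k - \lambda_1}\right) \stackrel{d}{\longrightarrow} \mathcal{N}\left(0, \Sigma^{(m)}\right).$$

The covariance matrix $\Sigma^{(m)}$ has rank 2 and is given by

$$\Sigma^{(m)} = v_{m/6} v_{m/6}^* + v_{5m/6} v_{5m/6}^*.$$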

The convergences in Theorems 1.1 and 1.2 also hold with all moments. For an expansion
of $\mathbb{E}[R_n]$ see (6).

We consider Theorems 1.1 and 1.2 as prototypical for a phenomenon which we conjecture
to occur frequently in related random combinatorial structures. E.g., we expect similar
behavior for the size of random $m$-ary search trees, cf. lamE], and for the number of leaves in
random $d$-dimensional (point) quadtrees [2]. (For both instances only the case of Theorem
1.1 is expected to occur.)


2 Outline of the proof 


In this section we first recall some known asymptotic behavior of the $R_n$ which is used
subsequently. Then we state a more refined result on certain projections of residuals in Proposition
2.1 which directly implies Theorems 1.1 and 1.2. Then, an outline of the proof of Proposition
2.1 is given. Technical steps and estimates are then sketched in Section 3. Throughout, we
fix an $m \ge 7$.

The cyclic urn with $m$ colours has the $m \times m$ replacement matrix

$$\mathcal{R} := \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}, \tag{4}$$

where $\mathcal{R}_{ij}$ indicates that after drawing a ball of type $i$ it is placed back together with $\mathcal{R}_{ij}$
balls of type $j$ for all $0 \le i,j \le m-1$. For the urn we consider the initial configuration of one
ball of type 0 and write $R_n$ for the composition vector after $n$ steps. The canonical filtration
is given by the $\sigma$-fields $\mathcal{F}_n = \sigma(R_0,\dots,R_n)$ for $n \ge 0$. The dynamics of the urn process imply
that, almost surely, we have

$$\mathbb{E}[R_{n+1} \mid \mathcal{F}_n] = \sum_{k=0}^{m-1} \frac{R_{n,k}}{n+1}\left(R_n + \mathcal{R}^t e_k\right) = \left(\mathrm{Id}_m + \frac{1}{n+1}\mathcal{R}^t\right) R_n, \qquad n \ge 0. \tag{5}$$

Here, $\mathrm{Id}_m$ denotes the $m \times m$ identity matrix and $\mathcal{R}^t$ the transpose of $\mathcal{R}$. The matrices $\mathcal{R}$
and $\mathrm{Id}_m + \frac{1}{n+1}\mathcal{R}^t$ have the same (right) eigenvectors $v_0,\dots,v_{m-1}$ given in (1).
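The eigenvector claim can be checked numerically. The sketch below assumes the convention $v_k = \frac{1}{m}(1, \omega^{-k}, \dots, \omega^{-(m-1)k})^t$ with $\omega = \exp(2\pi i/m)$; this convention is an assumption of the example. Under it, $\mathcal{R}^t v_k = \omega^k v_k$, so that $\mathrm{Id}_m + \frac{1}{n+1}\mathcal{R}^t$ has eigenvalue $1 + \omega^k/(n+1)$ on $v_k$. The function name `eigen_residual` is hypothetical.

```python
import cmath


def eigen_residual(m, k):
    """Largest entry (in modulus) of R^t v_k - omega^k v_k."""
    omega = cmath.exp(2j * cmath.pi / m)
    # Assumed convention: v_k = (1/m) (1, omega^{-k}, ..., omega^{-(m-1)k})^t.
    v = [omega ** (-k * t) / m for t in range(m)]
    # R has ones at the positions (i, i + 1 mod m), hence
    # (R^t v)_j = v_{(j - 1) mod m}.
    rt_v = [v[(j - 1) % m] for j in range(m)]
    return max(abs(rt_v[j] - omega ** k * v[j]) for j in range(m))
```

Evaluating `eigen_residual(m, k)` for all $0 \le k \le m-1$ should give values at the level of floating-point rounding error.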

Note that $v_0$ has the direction of the drift vector $\mathbf{1}$ in Theorems 1.1 and 1.2 and $v_1$
determines the directions of the a.s. fluctuations around the drift there. By diagonalizing
these matrices and using (5) one finds explicit expressions for the mean of the $R_n$, cf. m
Lemma 6.7]. With

$$c_k := \frac{2}{\Gamma(1+\omega^k)}, \qquad 1 \le k \le r,$$

these expressions imply the expansion, as $n \to \infty$,

$$\mathbb{E}[R_n] = \frac{n+1}{m}\mathbf{1} + \sum_{k=1}^{r} \Re\left(c_k n^{i\mu_k} v_k\right) n^{\lambda_k} + O(\sqrt{n}). \tag{6}$$

It is also known that the variances and covariances of $R_n$ are of the order $n^{2\lambda_1}$ with
appropriate periodic prefactors. This explains the normalization $n^{-\lambda_1}(R_n - \mathbb{E}[R_n])$ in Theorems
1.1 and 1.2. The analysis of the asymptotic distribution as stated in (3) has been done by
different techniques (partly only in a weak sense), by embedding into continuous time multitype
branching processes, by (more direct) use of martingale arguments, and by stochastic
fixed-point arguments, see pimiio].

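For our further analysis we use a spectral decomposition of the process $(R_n)_{n \ge 0}$. We
denote by $\pi_k$ the projection onto the eigenspace in $\mathbb{C}^m$ spanned by $v_k$ for $0 \le k \le m-1$.
Hence, we have

$$R_n = \sum_{k=0}^{m-1} \pi_k(R_n) = \pi_0(R_n) + \sum_{k=1}^{\lceil m/2 \rceil - 1} (\pi_k + \pi_{m-k})(R_n) + \mathbf{1}_{\{m \text{ even}\}}\, \pi_{m/2}(R_n),$$

where $\mathbf{1}_{\{\cdot\}}$ denotes an indicator. We have deterministically $\pi_0(R_n) = \frac{n+1}{m}\mathbf{1}$. For the other
projections $\pi_k(R_n)$ one has similar periodic behavior as for the composition vector $R_n$, cf. (3),
as long as we have $\lambda_k > \frac{1}{2}$. We call the projections $\pi_k(R_n)$ large if $\lambda_k > \frac{1}{2}$, since their
magnitudes have orders larger than $\sqrt{n}$. Projections $\pi_k$ with $\lambda_k \le \frac{1}{2}$ we call small. For the
large projections we have for all $1 \le k \le \lfloor m/2 \rfloor$ with $\lambda_k > \frac{1}{2}$ almost surely that

$$Y_{n,k} := \frac{1}{n^{\lambda_k}}(\pi_k + \pi_{m-k})(R_n - \mathbb{E}[R_n]) - 2\Re\left(n^{i\mu_k}\Xi_k v_k\right) \to 0 \tag{7}$$

with a complex valued random variable $\Xi_k$. The small projections $\pi_k(R_n)$ behave differently,
see [atti]. For those $k$ with $\lambda_k < \frac{1}{2}$ we have

$$X_{n,k} := \frac{1}{\sqrt{n}}(\pi_k + \pi_{m-k})(R_n - \mathbb{E}[R_n]) \stackrel{d}{\longrightarrow} \mathcal{N}(0, \Sigma_k), \tag{8}$$

with an appropriate covariance matrix $\Sigma_k$, see (16)-(18).

If $m$ is even then for $X_{n,m/2} := \frac{1}{\sqrt{n}}\pi_{m/2}(R_n - \mathbb{E}[R_n])$ we have a multivariate CLT as in (8).

Finally, if $6 \mid m$, then there is the pair $(\frac{m}{6}, \frac{5m}{6})$ with $\lambda_{m/6} = \lambda_{5m/6} = \frac{1}{2}$. In this case the
scaling requires an additional $\sqrt{\log n}$ factor. We have

$$X_{n,m/6} := \frac{1}{\sqrt{n \log n}}(\pi_{m/6} + \pi_{5m/6})(R_n - \mathbb{E}[R_n]) \stackrel{d}{\longrightarrow} \mathcal{N}(0, \Sigma_{m/6}). \tag{9}$$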

We identify the orders of the variances and covariances of the $Y_{n,k}$ in Section 3.1. These orders
imply that an appropriate normalization to study the fluctuations of the large projections is
given by

$$X_{n,k} := n^{\lambda_k - 1/2}\, Y_{n,k}. \tag{10}$$

Now, the $X_{n,k}$ are defined for all $1 \le k \le \lfloor m/2 \rfloor$ and describe the normalized fluctuations of
all the projections. For the small projections we already know that they are asymptotically
normally distributed, see (8). As a main contribution of the present paper we show that
the residuals of the large projections as normalized in (10) are also asymptotically normal.
Moreover, we show that all these fluctuations are jointly asymptotically normally distributed
and asymptotically independent:

Proposition 2.1. For the vector $(X_{n,1},\dots,X_{n,\lfloor m/2 \rfloor})$ defined in (8)-(10) we have

$$(X_{n,1},\dots,X_{n,\lfloor m/2 \rfloor}) \stackrel{d}{\longrightarrow} \mathcal{N}\left(0, \operatorname{diag}(\Sigma_1,\dots,\Sigma_{\lfloor m/2 \rfloor})\right),$$

where the blocks of the diagonal block matrix $\operatorname{diag}(\Sigma_1,\dots,\Sigma_{\lfloor m/2 \rfloor})$ are defined in (16)-(18).

Proposition 2.1 directly implies Theorems 1.1 and 1.2.

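Proof of Theorem 1.1. Let $m \ge 7$ with $6 \nmid m$, set $r = \lfloor (m-1)/6 \rfloor$ and let $\Xi_1,\dots,\Xi_r$ be as in
(7). Moreover, let $X_{n,1},\dots,X_{n,\lfloor m/2 \rfloor}$ be as in Proposition 2.1. Note that $6 \nmid m$ implies that there is
no $1 \le k \le m$ with $\lambda_k = \frac{1}{2}$. We obtain

$$\begin{aligned}
n^{\lambda_1 - 1/2}&\left(\frac{R_n - \mathbb{E}[R_n]}{n^{\lambda_1}} - \sum_{k=1}^{r} 2\Re\left(n^{i\mu_k}\Xi_k v_k\right) n^{\lambda_k - \lambda_1}\right) \\
&= n^{-1/2} \sum_{k=1}^{r} \left\{ (\pi_k + \pi_{m-k})(R_n - \mathbb{E}[R_n]) - 2\Re\left(n^{\lambda_k + i\mu_k}\Xi_k v_k\right) \right\} \\
&\quad + n^{-1/2} \sum_{k=r+1}^{\lceil m/2 \rceil - 1} (\pi_k + \pi_{m-k})(R_n - \mathbb{E}[R_n]) + \mathbf{1}_{\{m \text{ even}\}}\, n^{-1/2}\, \pi_{m/2}(R_n - \mathbb{E}[R_n]) \\
&= X_{n,1} + \cdots + X_{n,\lfloor m/2 \rfloor} \\
&\stackrel{d}{\longrightarrow} \mathcal{N}\left(0, \Sigma^{(m)}\right)
\end{aligned}$$

by Proposition 2.1 and the continuous mapping theorem, where $\Sigma^{(m)} = \Sigma_1 + \cdots + \Sigma_{\lfloor m/2 \rfloor}$.
That $\Sigma^{(m)}$ has rank $m-1$ is proven in Theorem 3.5. □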
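Proof of Theorem 1.2. Let $m \ge 7$ with $6 \mid m$, let $\Xi_1,\dots,\Xi_r$ be as in (7) and $X_{n,1},\dots,X_{n,\lfloor m/2 \rfloor}$
as in Proposition 2.1. Note that $6 \mid m$ implies that there is the pair $(m/6, 5m/6)$ with
$\lambda_{m/6} = \lambda_{5m/6} = \frac{1}{2}$. Rearranging terms as in the proof of Theorem 1.1 we obtain

$$\frac{n^{\lambda_1 - 1/2}}{\sqrt{\log n}}\left(\frac{R_n - \mathbb{E}[R_n]}{n^{\lambda_1}} - \sum_{k=1}^{r} 2\Re\left(n^{i\mu_k}\Xi_k v_k\right) n^{\lambda_k - \lambda_1}\right) = X_{n,m/6} + \frac{1}{\sqrt{\log n}} \sum_{\substack{k=1 \\ k \ne m/6}}^{m/2} X_{n,k} \stackrel{d}{\longrightarrow} \mathcal{N}\left(0, \Sigma^{(m)}\right)$$

by Proposition 2.1 and Slutsky's lemma, where $\Sigma^{(m)} = \Sigma_{m/6}$. That $\Sigma^{(m)}$ has rank 2 is proven
in Theorem 3.5. □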

To prove Proposition 2.1 we first derive moments and mixed moments needed for the
normalization in Section 3.1. The ranks of the covariance matrices $\Sigma^{(m)}$ are identified in
Section 3.2. In Section 3.3 a pointwise recursive equation for the complex random variables
$\Xi_1,\dots,\Xi_r$ is obtained together with a recurrence for the sequence $(R_n)_{n \ge 0}$ which extends to
a recurrence for the residuals in (3) as well as to the residuals of the projections of the $R_n$.
Finally, the joint convergence of the normalized residuals of all projections is shown by
an application of a stochastic fixed-point argument in the context of the contraction method
by use of the Zolotarev metric $\zeta_3$. However, only an indication and a solid reference are given
in Section 3.4.

3 Sketch of the proof of Proposition 2.1

3.1 Proper normalization of the residuals 

Denoting the inner product in $\mathbb{C}^m$ by $\langle \cdot, \cdot \rangle$ we first write the spectral decomposition of the
centered composition vector with respect to the orthonormal basis $\{\sqrt{m}\,v_k : 0 \le k < m\}$ of
the unitary vector space $\mathbb{C}^m$ as

$$R_n - \mathbb{E}[R_n] = \sum_{k=0}^{m-1} \pi_k\left(R_n - \mathbb{E}[R_n]\right) =: \sum_{k=0}^{m-1} u_k\left(R_n - \mathbb{E}[R_n]\right) v_k.$$

The evolution (5) of the process implies that the random variables

$$M_{n,k} := \frac{\Gamma(n+1)}{\Gamma(n+1+\omega^k)}\, u_k\left(R_n - \mathbb{E}[R_n]\right), \qquad n \ge 0, \tag{11}$$

for $k \in \{0,\dots,m-1\} \setminus \{m/2\}$ and

$$M_{n,m/2} := n \cdot u_{m/2}\left(R_n - \mathbb{E}[R_n]\right), \qquad n \ge 1, \tag{12}$$

define complex-valued, centered martingales. Note that the corresponding martingales $M^{[j]}_{n,k}$
when starting with one ball of type $j \in \{0,\dots,m-1\}$ satisfy

(convention := 


n 


Ai-l/2 
It is known, see [Hiisiin], that for all $k \in \{0,\dots,m-1\}$ with $\lambda_k = \Re(\omega^k) > 1/2$, there exists
a complex random variable $\Xi_k$ such that, as $n \to \infty$, we have

$$M_{n,k} \to \Xi_k \quad \text{almost surely}, \tag{13}$$

where the convergence also holds in $L_p$ for every $p \ge 1$. The $M_{n,k}$ with $\lambda_k = \Re(\omega^k) \le 1/2$
are also known to converge, after proper normalization, to normal limit laws.

Our subsequent analysis requires asymptotics for moments of and correlations between
the $u_k(R_n)$. Exploiting the dynamics of the urn in (5), elementary calculations imply that:

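Lemma 3.1. For all $k \in \{0,\dots,m-1\} \setminus \{m/2\}$, we have

$$\mathbb{E}\left[u_k(R_n)\right] = \sum_{t=0}^{m-1} \omega^{kt}\, \mathbb{E}\left[R_{n,t}\right] = \frac{\Gamma(n+1+\omega^k)}{\Gamma(n+1)\Gamma(1+\omega^k)},$$

while $\mathbb{E}\left[u_{m/2}(R_n)\right] = 0$ for $n \ge 1$. For all $k,\ell \in \{0,\dots,m-1\}$,

$$\mathbb{E}\left[u_k(R_n)\, u_\ell(R_n)\right] = \prod_{s=1}^{n}\left(1 + \frac{\omega^k+\omega^\ell}{s}\right) + \sum_{s=1}^{n} \frac{\omega^{k+\ell}}{s} \prod_{i=s+1}^{n}\left(1 + \frac{\omega^k+\omega^\ell}{i}\right) \prod_{i=1}^{s-1}\left(1 + \frac{\omega^{k+\ell}}{i}\right).$$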
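The first-moment formula can be checked numerically against the mean recursion that follows from (5). The sketch below iterates $\mathbb{E}[R_s] = (\mathrm{Id}_m + \frac{1}{s}\mathcal{R}^t)\,\mathbb{E}[R_{s-1}]$ and compares $\sum_t \omega^{kt}\,\mathbb{E}[R_{n,t}]$ with the product $\prod_{s=1}^{n}(1 + \omega^k/s)$, which equals the Gamma ratio $\Gamma(n+1+\omega^k)/(\Gamma(n+1)\Gamma(1+\omega^k))$ whenever $\omega^k \ne -1$. The product form is used because the Python standard library has no complex Gamma function; all function names are hypothetical.

```python
import cmath


def mean_composition(m, n):
    """Exact mean E[R_n], iterating E[R_s] = (Id + R^t / s) E[R_{s-1}]."""
    mean = [0.0] * m
    mean[0] = 1.0  # R_0 = e_0
    for s in range(1, n + 1):
        # (R^t x)_j = x_{(j - 1) mod m} for the cyclic replacement matrix.
        shifted = [mean[(j - 1) % m] for j in range(m)]
        mean = [mean[j] + shifted[j] / s for j in range(m)]
    return mean


def u_k_of_mean(m, n, k):
    """sum_t omega^{k t} E[R_{n,t}], with omega = exp(2 pi i / m)."""
    omega = cmath.exp(2j * cmath.pi / m)
    return sum(omega ** (k * t) * x for t, x in enumerate(mean_composition(m, n)))


def product_formula(m, n, k):
    """prod_{s=1}^{n} (1 + omega^k / s)."""
    omega = cmath.exp(2j * cmath.pi / m)
    value = 1.0 + 0.0j
    for s in range(1, n + 1):
        value *= 1 + omega ** k / s
    return value
```

For instance, `abs(u_k_of_mean(7, 200, 1) - product_formula(7, 200, 1))` should be at the level of rounding error, and for even $m$ the value `u_k_of_mean(m, n, m // 2)` should be numerically zero for $n \ge 1$.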

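From Lemma 3.1 we obtain the $L_2$-distance of the residuals of the martingales $(M_{n,k})_{n \ge 0}$
with $\lambda_k > \frac{1}{2}$ needed for the proper normalization of these residuals:

Lemma 3.2. For $k \ge 1$ such that $\lambda_k > 1/2$, as $n \to \infty$, we have

$$\mathbb{E}\left[\left|M_{n,k} - \Xi_k\right|^2\right] \sim \frac{1}{2\lambda_k - 1}\, n^{1 - 2\lambda_k}.$$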


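Lemma 3.2 directly implies the asymptotic covariances of the residuals of the centered
projections of the composition vector, which we denote by

$$\Pi_{n,k} := \begin{cases} \pi_k\left(R_n - \mathbb{E}[R_n]\right) - \dfrac{\Gamma(n+1+\omega^k)}{\Gamma(n+1)}\,\Xi_k v_k, & \text{if } \lambda_k > \frac{1}{2}, \\[2mm] \pi_k\left(R_n - \mathbb{E}[R_n]\right), & \text{otherwise}. \end{cases}$$

Note that this notation implies the representation

$$\left(R_n - \mathbb{E}[R_n]\right) - \sum_{k \ge 1:\, \lambda_k > 1/2} \frac{\Gamma(n+1+\omega^k)}{\Gamma(n+1)}\,\Xi_k v_k = \sum_{k=1}^{m-1} \Pi_{n,k}.$$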


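Lemma 3.2 implies:

Lemma 3.3. For all $k \in \{1,\dots,m-1\} \setminus \{\frac{m}{6}, \frac{5m}{6}\}$, as $n \to \infty$, we have

$$\operatorname{Cov}\left(\Pi_{n,k}\right) \sim \frac{1}{|2\lambda_k - 1|} \cdot n \cdot v_k v_k^*. \tag{14}$$

If $6 \mid m$, then

$$\operatorname{Cov}\left(\Pi_{n,m/6}\right) \sim n\log(n) \cdot v_{m/6} v_{m/6}^*, \qquad \operatorname{Cov}\left(\Pi_{n,5m/6}\right) \sim n\log(n) \cdot v_{5m/6} v_{5m/6}^*. \tag{15}$$

This also determines the covariance matrices in Proposition 2.1. We have

$$\Sigma_k = \frac{1}{|2\lambda_k - 1|}\, v_k v_k^* + \frac{1}{|2\lambda_{m-k} - 1|}\, v_{m-k} v_{m-k}^* \tag{16}$$

for $k \in \{1,\dots,\lceil m/2 \rceil - 1\} \setminus \{\frac{m}{6}\}$ as well as

$$\Sigma_{m/6} = v_{m/6} v_{m/6}^* + v_{5m/6} v_{5m/6}^*, \quad \text{if } 6 \mid m, \tag{17}$$

$$\Sigma_{m/2} = \frac{1}{|2\lambda_{m/2} - 1|}\, v_{m/2} v_{m/2}^*, \quad \text{if } 2 \mid m. \tag{18}$$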


We also need to control correlations of residuals between different eigenspaces. An explicit
calculation implies for all $k,\ell \ge 1$ with $k \ne \ell$ and $\lambda_k, \lambda_\ell > \frac{1}{2}$ that

E [{Mn,k - Hfc) (M„,^ - H^)] = O (n-i + . (19) 

The bound (19) implies:

Lemma 3.4. Let k,£ > 1 with k ^ £ and n —>■ oo. If Afc, A^ > | or Afc, then 

Cov (n„,fc,n„/) = o(n). 

If ^k> \ and \i <\ then 

Cov (iiji/j, £) 0. 

These moment estimates are sufficient to subsequently properly scale the projections of
the residuals and to guarantee the finiteness of the Zolotarev metric $\zeta_3$ used.


3.2 The rank of the covariance matrices 

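The covariance matrices $\Sigma^{(m)}$ in Theorems 1.1 and 1.2 appear as the sums of the covariance
matrices in (16) and (18) if $6 \nmid m$ and as the covariance matrix in (17) if $6 \mid m$. We obtain
their ranks as follows:

Theorem 3.5. For $6 \nmid m$, the matrix

$$\Sigma^{(m)} = \sum_{k=1}^{m-1} \frac{1}{|2\lambda_k - 1|}\, v_k v_k^* \tag{20}$$

has rank $m-1$, while for $6 \mid m$,

$$\Sigma^{(m)} = v_{m/6} v_{m/6}^* + v_{5m/6} v_{5m/6}^* \tag{21}$$

has rank two.

Proof. Note that the matrix-vector product $m v_k v_k^* x$ is the orthogonal projection of $x \in \mathbb{C}^m$
onto the eigenspace spanned by $v_k$. Hence, we have

$$\mathrm{Id}_m = \sum_{k=0}^{m-1} m\, v_k v_k^*.$$

For $6 \nmid m$ the matrix $m\Sigma^{(m)}$ is a linear combination, with strictly positive coefficients, of the
orthogonal projections onto the mutually orthogonal lines spanned by $v_1,\dots,v_{m-1}$, so its range
is $\operatorname{span}\{v_1,\dots,v_{m-1}\}$. For $6 \mid m$ the matrix $m\Sigma^{(m)}$ is the orthogonal projection onto
the subspace $\operatorname{span}\{v_{m/6}, v_{5m/6}\}$. Hence, we obtain the ranks $m-1$ and 2, respectively. □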
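The rank statement can be illustrated numerically for small $m$. The sketch below builds the matrix $\Sigma^{(m)}$ from the formulas in Theorem 3.5, assuming the convention $v_k = \frac{1}{m}(1, \omega^{-k}, \dots, \omega^{-(m-1)k})^t$, and computes its numerical rank by Gaussian elimination with partial pivoting. The helper names and the tolerance `1e-9` are choices made for this example only; a numerical rank is an illustration, not a proof.

```python
import cmath
import math


def v(m, k):
    """Assumed convention: v_k = (1/m) (1, omega^{-k}, ..., omega^{-(m-1)k})^t."""
    omega = cmath.exp(2j * math.pi / m)
    return [omega ** (-k * t) / m for t in range(m)]


def sigma(m):
    """Sigma^(m) as in (20) if 6 does not divide m, and as in (21) otherwise."""
    if m % 6 == 0:
        terms = [(1.0, m // 6), (1.0, 5 * m // 6)]
    else:
        terms = [
            (1.0 / abs(2 * math.cos(2 * math.pi * k / m) - 1), k)
            for k in range(1, m)
        ]
    s = [[0j] * m for _ in range(m)]
    for weight, k in terms:
        vk = v(m, k)
        for i in range(m):
            for j in range(m):
                s[i][j] += weight * vk[i] * vk[j].conjugate()
    return s


def matrix_rank(a, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n_rows, n_cols = len(a), len(a[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < tol:
            continue  # column depends on the previous ones (numerically)
        a[rank], a[pivot] = a[pivot], a[rank]
        for i in range(rank + 1, n_rows):
            factor = a[i][col] / a[rank][col]
            for j in range(col, n_cols):
                a[i][j] -= factor * a[rank][j]
        rank += 1
    return rank
```

With these helpers, `matrix_rank(sigma(7))` evaluates to 6 and `matrix_rank(sigma(12))` to 2, in line with the ranks $m-1$ and 2 stated in the theorem.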






3.3 Embedding into a random binary search tree 

In this section we describe the self-similarity of the martingale limits by deriving an almost
sure recursive equation for the $\Xi_k$ and a distributional recurrence for the sequence $(R_n)_{n \ge 0}$
which extends to a recurrence for the residuals in (3) as well as to the normalized residuals
$X_{n,k}$ of the projections of the $R_n$.

For this, we embed the cyclic urn process into a random binary search tree. The random
binary search tree starts with one external node. In each step one of the external nodes is
chosen uniformly at random (and independently from the previous choices) and replaced by
one internal node with two children, the children being external nodes attached along a left
and right branch. The cyclic urn is embedded into the evolution of the random binary search
tree by labeling its external nodes by the types of the balls. The initial external node is
labeled by type 0. Whenever an external node of type $j \in \{0,\dots,m-1\}$ is replaced by an
internal node its (new) left child gets label $j$, its right child gets label $j+1$ mod $m$. Note
that the external nodes of the tree correspond to the balls in the urn. A related embedding
was exploited in m Section 6.3]. Note that the binary search tree starting with one external
node labeled 0 decomposes into its left and right subtree starting with external nodes of types
0 and 1, respectively. The size (number of internal nodes) $I_n$ of the left subtree is uniformly
distributed on $\{0,\dots,n-1\}$. This implies, with $J_n := n-1-I_n$, the recurrence



$$R_n = R^{[0],(0)}_{I_n} + R^{[1],(1)}_{J_n} = R^{[0],(0)}_{I_n} + \mathcal{R}^t R^{[0],(1)}_{J_n}, \tag{22}$$

where the sequences $(R^{[0],(0)}_n)_{n \ge 0}$ and $(R^{[1],(1)}_n)_{n \ge 0}$ denote the composition vectors of the cyclic
urns given by the evolutions of the left and right subtrees of the root of the binary search tree
(upper indices (0) and (1) denoting left and right subtree, upper indices [0] and [1] denoting
the initial type). They are independent and independent of $I_n$. Note that the second equation
in (22) is due to (2) where the $R^{[0],(1)}_n$ are chosen appropriately for pointwise equality. Now,
applying the transformation and scaling which turns $R_n$ into $M_{n,k}$ to the left and right hand
side of (22), letting $n \to \infty$ and using the convergence in (13) implies the following recursive
equation for the $\Xi_k$:

Proposition 3.6. For all $k \ge 1$ with $\lambda_k > \frac{1}{2}$ there exist independent random variables $U$,
$\Xi_k^{(0)}$, $\Xi_k^{(1)}$ such that

$$\Xi_k = U^{\omega^k}\,\Xi_k^{(0)} + \omega^k (1-U)^{\omega^k}\,\Xi_k^{(1)} + g_k(U), \tag{23}$$

where

$$g_k(u) := \frac{1}{\Gamma(1+\omega^k)}\left(u^{\omega^k} + \omega^k (1-u)^{\omega^k} - 1\right)$$

and $U$ has the uniform distribution on $[0,1]$ and $\Xi_k^{(0)}$ and $\Xi_k^{(1)}$ have the same distribution as
$\Xi_k$.

Alternatively, the martingale limits $\Xi_k$ can be written explicitly as deterministic functions
of the limit of the random binary search tree when interpreting the evolution of the random
binary search tree as a transient Markov chain and its limit as a random variable in the
Markov chain's Doob-Martin boundary, see HE]. From this representation the self-similarity
relation (23) can be read off as well.
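The embedding into the random binary search tree can be sketched in code. The following Python function grows the tree for $n \ge 1$ steps while remembering, for every external node, whether it belongs to the left or the right subtree of the root, and returns the type counts of the two subtrees. Their sum is the composition vector $R_n$, and the left subtree has $I_n$ internal nodes, that is $I_n + 1$ external nodes. The function name and the data layout are assumptions made for this illustration.

```python
import random


def split_urn_by_subtree(m, n, seed=None):
    """Grow a random binary search tree for n >= 1 steps.

    Returns (left, right): the type counts of the external nodes in the
    left and in the right subtree of the root.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    # After the first step the root is internal; its left child is an external
    # node of type 0 and its right child an external node of type 1 mod m.
    nodes = [(0, 0), (1, 1 % m)]  # (side, type); side 0 = left, 1 = right
    for _ in range(n - 1):
        side, j = nodes[rng.randrange(len(nodes))]
        # The chosen external node becomes internal. Its left child keeps
        # type j (the existing entry), its right child gets type j + 1 mod m.
        nodes.append((side, (j + 1) % m))
    left, right = [0] * m, [0] * m
    for side, j in nodes:
        if side == 0:
            left[j] += 1
        else:
            right[j] += 1
    return left, right
```

With `left, right = split_urn_by_subtree(7, 1000, seed=3)`, the values `sum(left) - 1` and `sum(right) - 1` play the roles of $I_n$ and $J_n$ and add up to $n - 1 = 999$.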
3.4 Proving convergence

Note that the left and right hand sides of (22) and (23) are linked via the convergence of the
$M_{n,k}$ towards $\Xi_k$. This allows one to derive a recurrence for the vector $(X_{n,1},\dots,X_{n,\lfloor m/2 \rfloor})$
in Proposition 2.1. The reader is asked to trust the authors that the techniques developed
in [12] for a univariate problem can be extended to the multivariate recurrences for
$(X_{n,1},\dots,X_{n,\lfloor m/2 \rfloor})$ and that the same type of proof as in [12] based on the Zolotarev metric
$\zeta_3$ can be applied.